Phoneme-based vector quantization in a discrete HMM speech recognizer
Authors
Abstract
The quantization distortion of vector quantization (VQ) is a key factor affecting the performance of a discrete hidden Markov model (DHMM) system. Many researchers have recognized this problem and have tried to use integrated features or multiple codebooks in their systems to offset the disadvantage of conventional VQ. However, the computational complexity of those systems then increases. Investigations have shown that the speech signal space consists of a finite number of clusters that represent phoneme data sets from male and female speakers and exhibit Gaussian distributions. In this paper we propose an alternative VQ method in which each phoneme is treated as a cluster in the speech space and a Gaussian model is estimated for each phoneme. A Gaussian mixture model (GMM) is generated by the expectation-maximization (EM) algorithm for the whole speech space and used as a codebook in which each code word is a Gaussian model and represents a certain cluster. An input utterance is classified as a certain phoneme or set of phonemes only when that phoneme or those phonemes give the highest likelihood. A typical discrete HMM system was used for both phoneme and isolated word recognition. The results show that the phoneme-based Gaussian modeling vector quantization classifies the speech space more effectively, and that significant improvements in the performance of the DHMM system have been achieved.
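The quantization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal covariances, and it replaces EM training with per-phoneme sample statistics computed from labelled frames. The helper names (`fit_gaussians`, `log_likelihood`, `quantize`), the variance floor, and the toy two-dimensional data are all hypothetical.

```python
import math

def fit_gaussians(frames, labels):
    """Estimate one diagonal-covariance Gaussian per phoneme label.

    Assumption: labelled sample statistics stand in for the EM training
    mentioned in the abstract.
    """
    grouped = {}
    for x, y in zip(frames, labels):
        grouped.setdefault(y, []).append(x)
    codebook = {}
    for y, xs in grouped.items():
        n, dim = len(xs), len(xs[0])
        mean = [sum(x[d] for x in xs) / n for d in range(dim)]
        # The variance floor (an assumed value) keeps the log-likelihood finite.
        var = [max(sum((x[d] - mean[d]) ** 2 for x in xs) / n, 1e-3)
               for d in range(dim)]
        codebook[y] = (mean, var)
    return codebook

def log_likelihood(x, mean, var):
    """Log density of frame x under a diagonal-covariance Gaussian."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def quantize(frames, codebook):
    """Map each frame to the code word (phoneme) with the highest likelihood."""
    return [max(codebook, key=lambda y: log_likelihood(x, *codebook[y]))
            for x in frames]

# Toy 2-D "feature frames" for two phoneme clusters (made-up values).
train = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0],
         [3.0, 3.1], [2.9, 2.8], [3.2, 3.0]]
labels = ["aa", "aa", "aa", "iy", "iy", "iy"]

codebook = fit_gaussians(train, labels)
symbols = quantize([[0.1, 0.0], [3.1, 2.9]], codebook)
print(symbols)
```

The resulting symbol sequence is what a discrete HMM would consume as its observation stream. Unlike a conventional VQ codebook, which picks the nearest centroid by distance, each code word here is a density, so cluster spread also influences the assignment.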
Similar resources
Support Vector Machines for Postprocessing of Speech Recognition Hypotheses
In this paper, we introduce an approach to improve the recognition performance of a Hidden Markov Model (HMM) based monophone recognizer using Support Vector Machines (SVMs). We developed and examined a method for re-scoring the HMM recognizer hypotheses by SVMs in a phoneme recognition framework. Compared to a stand-alone HMM system, an improvement of 9.2% was reached on the TIMIT database and...
LVQ as a Feature Transformation for HMMs
We present a new way to take advantage of the discriminative power of Learning Vector Quantization in combination with continuous density hidden Markov models. This is based on viewing LVQ as a non-linear feature transformation. Class-wise quantization errors of LVQ are modeled by continuous density HMMs, whereas the practice in the literature regarding LVQ/HMM hybrids is to use LVQ-codebooks ...
Phone vector DHMM to decode a phone recognizer's output
In this paper we introduce a Phone Vector Discrete HMM (PVDHMM) that decodes a phone recognizer’s output. The proposed PVDHMM treats a phone recognizer as a vector quantizer whose codebook size is equal to the size of its phone set. To examine the proposed method we perform two experiments. First, the output of a phone recognizer is recognized by the PVDHMM, and its results are compared with th...
Advanced training methods and new network topologies for hybrid MMI-connectionist/HMM speech recognition systems
This paper deals with the construction and optimization of a hybrid speech recognition system that consists of a combination of a neural vector quantizer (VQ) and discrete HMMs. In our investigations an integration of VQ based classification in the continuous classifier framework is given and some constraints are derived that must hold for the pdfs in the discrete pattern classifier context. Furth...
Improving the performance of HMM-based very low bit rate speech coding
In this paper, we define an F0 quantization scheme for a very low bit rate speech coder based on HMM (Hidden Markov Model). In the coding system, the encoder carries out phoneme recognition, and transmits phoneme indices, state durations and F0 information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indices, and a sequence of mel-cepstral coefficient v...
Journal title:
- IEEE Trans. Speech and Audio Processing
Volume 5, Issue
Pages -
Publication date: 1997